Frequency of occurrence of phonemes and syllables in Thai: Analysis of spoken and written corpora
نویسندگان
چکیده
This work provides detailed frequency and distribution of Thai phonemes, biphones, and syllable types drawn from three large-scale Thai corpora (InterBEST, LOTUS-BN, and LOTUS-Cell 2.0). Comparisons are carried out to examine an extent to which linguistic variation, associated with different corpus types (written vs. spoken), affects frequency statistics and distribution patterns. Results and statistical analysis show that there is a high correlation in terms of occurrence frequency and distribution in the case of tones and syllable types. However, large degrees of discrepancy exist among the data sets of initial consonants, vowels, and final consonants. Comparisons of this type are needed for other languages to reliably show the degrees to which different types of language corpus and linguistic variation contribute to variability in phoneme frequency and distribution.
منابع مشابه
Comparing Syllable Frequencies in Corpora of Written and Spoken Language
In this study, various German language corpora were compared in order to discover the extent to which syllable frequencies remain stable across different contexts and modalities. Although considerable differences in relative frequency were found among the more common syllables, rank numbers proved to be more robust. Variation across corpora was mostly due to vocabulary characteristics of partic...
متن کاملA Corpus-Based Study of Phoneme Distribution in Thai
This paper presents steps in accessing Thai phoneme distribution from large-scale written Thai corpora. The data were from 12 text genres from InterBEST [1], considered the biggest Thai corpora. Each word was transliterated using the grapheme-to-phoneme software [2]. Then, frequency of words, frequency of 81 Thai phonemes in each genre, and the 95% CIs of average occurrences of each phoneme wer...
متن کاملMetadiscourse Elements in English Research Articles Written by Native English and Non-native Iranian Writers in Applied Linguistics and Civil Engineering
This study investigated metadiscourse and its subcategories in English research articles (RAs) written by nonnative (Iranian) and native English writers from the two disciplines of applied linguistics and civil engineering. The study aimed at seeing whether language and discipline influenced the frequency of occurrence of metadiscourse elements in research articles. To this end, a sample of 120...
متن کاملA Comparative Analysis of Lexical Bundles in Journalistic Writing in English and Persian: A Contrastive Linguistic Perspective
This paper investigates the use of ‘lexical bundles’ in two broad corpora of journalistic writing. The aim of this study is to compare the use of lexical bundles in the two domains, one consisted of newspaper articles written in English and published in England and the other one comprised of newspaper articles written in Persian from Iranian publications. For this purpose, the frequency...
متن کاملSemantic processing survey of spoken and written words in adolescents with cerebral palsy: Evidence from PALPA word-picture matching test
Objective: The present study aimed to assess and compare semantic processing of spoken and written words in adolescents with cerebral palsy and healthy adolescents. Method: The present study is quantitative in terms of type and experimental in terms of method. Examination Group consisted 30 adolescents with cerebral palsy aged 10 to 15 years were selected by convenience sampling method. All of ...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
دوره شماره
صفحات -
تاریخ انتشار 2015